1.
Front Neurol ; 13: 965362, 2022.
Article in English | MEDLINE | ID: mdl-36267885

ABSTRACT

Background and purpose: Distinguishing between intracranial atherosclerosis-related occlusion (ICAS-O) and non-ICAS-O can help identify the need for surgical planning prior to thrombectomy. We investigated the association between vertebrobasilar artery calcification (VBAC) and ICAS-O in acute ischemic stroke patients undergoing thrombectomy. Methods: Patients who had undergone thrombectomy between October 2017 and October 2021 were recruited from a prospective single-center registry study. The enrolled patients were divided into ICAS-O and non-ICAS-O groups, as determined during intra-arterial therapy. The occurrence of VBAC was recorded on intracranial non-contrast computed tomography (NCCT) scans before thrombectomy. The association between VBAC and ICAS-O was assessed using binary logistic regression. Results: A total of 2732 patients who had undergone digital subtraction angiography were reviewed, and 314 thrombectomy patients (mean age: 65.4 years, 36.6% female) with NCCT were enrolled in this study. VBAC was detected before thrombectomy in 113 (36%) of 314 patients. Age, hypertension, and diabetes were associated with VBAC, and a higher frequency of VBAC was identified in patients with posterior circulation involvement. ICAS-O accounted for 43% (135/314) of eligible patients. In multivariable analyses, VBAC was identified as an independent predictor of ICAS-O (adjusted odds ratio, 6.16 [95% CI, 2.673-14.217], P < 0.001). Moreover, the (VBAC[+] atrial fibrillation[-]) group displayed higher rates of ICAS-O than the (VBAC[-] atrial fibrillation[-]) group (P < 0.001). Conclusions: We demonstrated that VBAC is an independent risk factor for ICAS-O in patients who underwent thrombectomy. Patients with VBAC who are free of atrial fibrillation are more likely to have ICAS-O.
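The reported adjusted odds ratio and its confidence interval can be related back to the underlying logistic-regression coefficient. The sketch below assumes the interval is a Wald-type 95% interval (log-odds plus or minus 1.96 standard errors), which the abstract does not state; the function name is illustrative and not taken from the study.

```python
import math

def coef_from_odds_ratio_ci(ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from a CI.

    Assumption: the interval was computed as exp(beta +/- z * se).
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    beta = (log_low + log_high) / 2        # midpoint on the log scale
    se = (log_high - log_low) / (2 * z)    # half-width divided by z
    return beta, se

# Interval reported in the abstract for VBAC as a predictor of ICAS-O.
beta, se = coef_from_odds_ratio_ci(2.673, 14.217)
odds_ratio = math.exp(beta)  # should land close to the reported 6.16
```

Under this assumption the interval is symmetric on the log scale, so the geometric mean of its bounds should reproduce the reported point estimate.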

2.
Nat Cell Biol ; 24(6): 928-939, 2022 06.
Article in English | MEDLINE | ID: mdl-35618746

ABSTRACT

Most mammalian genes generate messenger RNAs with variable untranslated regions (UTRs) that are important post-transcriptional regulators. In cancer, shortening at 3' UTR ends via alternative polyadenylation can activate oncogenes. However, internal 3' UTR splicing remains poorly understood as splicing studies have traditionally focused on protein-coding alterations. Here we systematically map the pan-cancer landscape of 3' UTR splicing and present this in SpUR ( http://www.cbrc.kaust.edu.sa/spur/home/ ). 3' UTR splicing is widespread, upregulated in cancers, correlated with poor prognosis and more prevalent in oncogenes. We show that antisense oligonucleotide-mediated inhibition of 3' UTR splicing efficiently reduces oncogene expression and impedes tumour progression. Notably, CTNNB1 3' UTR splicing is the most consistently dysregulated event across cancers. We validate its upregulation in hepatocellular carcinoma and colon adenocarcinoma, and show that the spliced 3' UTR variant is the predominant contributor to its oncogenic functions. Overall, our study highlights the importance of 3' UTR splicing in cancer and may launch new avenues for RNA-based anti-cancer therapeutics.


Subjects
Adenocarcinoma; Colonic Neoplasms; 3' Untranslated Regions/genetics; Adenocarcinoma/genetics; Alternative Splicing/genetics; Animals; Carcinogenesis/genetics; Colonic Neoplasms/genetics; Mammals; Up-Regulation
3.
Comput Struct Biotechnol J ; 19: 3015-3026, 2021.
Article in English | MEDLINE | ID: mdl-34136099

ABSTRACT

RNA modifications, in particular N6-methyladenosine (m6A), participate in every stage of RNA metabolism and play diverse roles in essential biological processes and disease pathogenesis. Thanks to advances in sequencing technology, tens of thousands of RNA modification sites can be identified in a typical high-throughput experiment; however, it remains a major challenge to decipher the functional relevance of these sites, such as their effects on alternative splicing, their roles in the regulatory circuits of essential biological processes, or their association with diseases. As the focus of RNA epigenetics gradually shifts from site discovery to functional studies, we review here recent progress in functional annotation and prediction of RNA modification sites from a bioinformatics perspective. The review covers naïve annotation with associated biological events, e.g., single nucleotide polymorphism (SNP), RNA binding protein (RBP) and alternative splicing, prediction of key sites and their regulatory functions, inference of disease association, and mining the diagnostic and prognostic value of RNA modification regulators. We further discuss the limitations of existing approaches and some future perspectives.

4.
Bioinformatics ; 37(22): 4277-4279, 2021 11 18.
Article in English | MEDLINE | ID: mdl-33974000

ABSTRACT

MOTIVATION: N6-methyladenosine (m6A) is the most abundant mammalian mRNA methylation with versatile functions. To date, although a number of bioinformatics tools have been developed for location discovery of m6A modification, functional understanding is still quite limited. As the focus of RNA epigenetics gradually shifts from site discovery to functional studies, there is an urgent need for user-friendly tools to identify and explore the functional relevance of context-specific m6A methylation to gain insights into the epitranscriptome layer of gene expression regulation. RESULTS: We introduce here Funm6AViewer, a novel platform to identify, prioritize and visualize the functional gene interaction networks mediated by dynamic m6A RNA methylation unveiled from a case-control study. Taking as inputs the differential RNA methylation data and differential gene expression data, both of which can be inferred from the widely used MeRIP-seq data, Funm6AViewer enables a series of analyses, including: (i) examining the distribution of differential m6A sites, (ii) prioritizing the genes mediated by dynamic m6A methylation and (iii) characterizing functionally the gene regulatory networks mediated by condition-specific m6A RNA methylation. Funm6AViewer should effectively facilitate the understanding of the epitranscriptome circuitry mediated by this reversible RNA modification. AVAILABILITY AND IMPLEMENTATION: Funm6AViewer is available both as a convenient web server (https://www.xjtlu.edu.cn/biologicalsciences/funm6aviewer) with graphical interface and as an independent R package (https://github.com/NWPU-903PR/Funm6AViewer) for local usage. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Epigenesis, Genetic; RNA; Animals; Methylation; Case-Control Studies; RNA/metabolism; Gene Regulatory Networks; Adenosine/metabolism; Mammals/genetics
5.
Inorg Chem ; 59(22): 16582-16590, 2020 Nov 16.
Article in English | MEDLINE | ID: mdl-33113329

ABSTRACT

Several types of air-stable N,O-coordinate half-sandwich iridium complexes containing Schiff base ligands with the general formula [Cp*IrClL] were synthesized in good yields. These stable iridium complexes displayed a good catalytic efficiency in amide synthesis. A variety of amides with different substituents were obtained in a one-pot procedure with excellent yields and high selectivities through the amidation of aldehydes with NH2OH·HCl and nitrile hydration under the catalysis of complexes 1-4. The excellent and diverse catalytic activity, mild conditions, broad substrate scope, and environmentally friendly solvent make this system potentially applicable in industrial production. Half-sandwich iridium complexes 1-4 were characterized by NMR, elemental analysis, and IR techniques. Molecular structures of complexes 2 and 3 were confirmed by single-crystal X-ray analysis.

6.
Int J Mol Sci ; 21(15), 2020 Jul 23.
Article in English | MEDLINE | ID: mdl-32718000

ABSTRACT

Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step for analyzing the lncRNA functional mechanism. However, the experimental identification of lncRNAs is expensive and time-consuming. In this study, we present an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporates three different input modalities; a multimodal deep learning framework is then built to learn high-level abstract representations and predict the probability that a transcript is a lncRNA. LncRNA_Mdeep achieved 98.73% prediction accuracy in a 10-fold cross-validation test on human data. In an independent test on human data, lncRNA_Mdeep achieved 93.12% prediction accuracy, which was 0.94% to 15.41% higher than that of eight other state-of-the-art methods. In addition, the results on 11 cross-species datasets showed that lncRNA_Mdeep is a powerful predictor of lncRNAs.


Subjects
Databases, Nucleic Acid; Deep Learning; RNA, Long Noncoding/genetics; Software; Animals; Humans; Mice
7.
Anal Biochem ; 601: 113767, 2020 07 15.
Article in English | MEDLINE | ID: mdl-32454029

ABSTRACT

Long noncoding RNAs (lncRNAs) play critical roles in many pathological and biological processes, such as post-transcription, cell differentiation and gene regulation. A growing number of studies have shown that lncRNAs function mainly through interactions with specific RNA binding proteins (RBPs). However, experimental identification of potential lncRNA-protein interactions is costly and time-consuming. In this work, we propose a novel convolutional neural network-based method with the copy-padding trick (named LPI-CNNCP) to predict lncRNA-protein interactions. The copy-padding trick of LPI-CNNCP converts variable-length protein/RNA sequences into fixed-length sequences, thus enabling the construction of the CNN model. A high-order one-hot encoding is also applied to transform the protein/RNA sequences into image-like inputs for capturing the dependencies among amino acids (or nucleotides). Finally, these encoded protein/RNA sequences are fed into a CNN to predict the lncRNA-protein interactions. Compared with other state-of-the-art methods in a 10-fold cross-validation (10CV) test, LPI-CNNCP shows the best performance. Results in the independent test demonstrate that LPI-CNNCP can effectively predict potential lncRNA-protein interactions. We also compared the copy-padding trick with two other existing tricks (i.e., zero-padding and cropping), and the results show that our copy-padding trick outperforms the zero-padding and cropping tricks in predicting lncRNA-protein interactions. The source code of LPI-CNNCP and the datasets used in this work are available at https://github.com/NWPU-903PR/LPI-CNNCP for academic users.
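The two preprocessing ideas named in this abstract, copy-padding and high-order one-hot encoding, can be illustrated with a minimal sketch. The abstract does not spell out either rule, so both functions below are assumptions rather than the LPI-CNNCP implementation: copy-padding is taken to mean repeating the sequence from its start until the target length is reached, and the high-order encoding is taken to mean one-hot encoding of each overlapping k-mer. The function names and the RNA alphabet are illustrative.

```python
from itertools import product

def copy_pad(seq, target_len):
    """Fill target_len by repeating seq from its start; longer inputs are truncated."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    repeats = -(-target_len // len(seq))  # ceiling division
    return (seq * repeats)[:target_len]

def high_order_one_hot(seq, alphabet="ACGU", order=2):
    """One-hot encode each overlapping k-mer (k = order) over all possible k-mers."""
    kmers = ["".join(p) for p in product(alphabet, repeat=order)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    rows = []
    for start in range(len(seq) - order + 1):
        row = [0] * len(kmers)
        pos = index.get(seq[start:start + order])
        if pos is not None:  # k-mers containing unknown symbols stay all-zero
            row[pos] = 1
        rows.append(row)
    return rows

padded = copy_pad("ACGU", 10)          # fixed-length input for a CNN
encoded = high_order_one_hot(padded)   # one row per overlapping 2-mer
```

The resulting matrix has one row per position and one column per possible k-mer, which is the kind of image-like input a convolutional layer can consume. Unlike zero-padding, every position of the padded sequence carries real residues.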


Subjects
Neural Networks, Computer; RNA, Long Noncoding/chemistry; RNA-Binding Proteins/chemistry; Amino Acid Sequence; Humans
8.
Inorg Chem ; 59(7): 4800-4809, 2020 Apr 06.
Article in English | MEDLINE | ID: mdl-32212643

ABSTRACT

Several N,O-coordinate half-sandwich iridium complexes, 1-5, containing constrained bulky β-enaminoketonato ligands were prepared and clearly characterized. Single-crystal X-ray diffraction characterization of these complexes indicates that the iridium center adopts a distorted octahedral geometry. Complexes 1-5 showed good catalytic efficiency in the oxidative homocoupling of primary amines, dehydrogenation of secondary amines, and the oxidative cross-coupling of amines and alcohols, which furnished various types of imines in good yields and high selectivities using O2 as an oxidant under mild conditions. No distinctive substituent effects of the iridium catalysts were observed in these reactions. The diverse catalytic activity, broad substrate scope, mild reaction conditions, and high yields of the products made this catalytic system attractive in industrial processes.

9.
Bioinformatics ; 35(14): i90-i98, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510685

ABSTRACT

MOTIVATION: As the most abundant mammalian mRNA methylation, N6-methyladenosine (m6A) exists in >25% of human mRNAs and is involved in regulating many different aspects of mRNA metabolism, stem cell differentiation and diseases like cancer. However, our current knowledge about dynamic changes of m6A levels and how the change of m6A levels for a specific gene can play a role in certain biological processes like stem cell differentiation and diseases like cancer remains largely elusive. RESULTS: To address this, we propose in this paper FunDMDeep-m6A, a novel pipeline for identifying context-specific (e.g. disease versus normal, differentiated cells versus stem cells or gene knockdown cells versus wild-type cells) m6A-mediated functional genes. FunDMDeep-m6A includes, as its first step, DMDeep-m6A, a novel method based on a deep learning model and a statistical test for identifying differential m6A methylation (DmM) sites from MeRIP-Seq data at a single-base resolution. FunDMDeep-m6A then identifies and prioritizes functional DmM genes (FDmMGenes) by combining the DmM genes (DmMGenes) with differential expression analysis using a network-based method. This proposed network method includes a novel m6A-signaling bridge (MSB) score to quantify the functional significance of DmMGenes by assessing functional interaction of DmMGenes with their signaling pathways using a heat diffusion process in protein-protein interaction (PPI) networks. The test results on 4 context-specific MeRIP-Seq datasets showed that FunDMDeep-m6A can identify more context-specific and functionally significant FDmMGenes than m6A-Driver. The functional enrichment analysis of these genes revealed that m6A targets key genes of many important context-related biological processes including embryonic development, stem cell differentiation, transcription, translation, cell death, cell proliferation and cancer-related pathways. These results demonstrate the power of FunDMDeep-m6A for elucidating m6A regulatory functions and its roles in biological processes and diseases. AVAILABILITY AND IMPLEMENTATION: The R-package for DMDeep-m6A is freely available from https://github.com/NWPU-903PR/DMDeepm6A1.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Neoplasms; Protein Interaction Maps; RNA; Animals; Humans; Methylation; Neoplasms/genetics; RNA, Messenger; Software
10.
BMC Bioinformatics ; 20(1): 225, 2019 May 02.
Article in English | MEDLINE | ID: mdl-31046665

ABSTRACT

BACKGROUND: Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This is made possible by the rise of high-throughput technology. Unfortunately, computational methods to identify functional modules have been limited by the data quality issues of high-throughput techniques. This study aims to integrate knowledge extracted from the literature to further improve the accuracy of functional module identification. RESULTS: Our new model and algorithm were applied to both yeast and human interactomes. Predicted functional modules covered over 90% of the proteins in both organisms, while maintaining a comparable overall accuracy. We found that the combination of mRNA expression information and biomedical knowledge greatly improved the performance of functional module identification, outperforming approaches that use only a protein interaction network weighted with transcriptomic data or literature knowledge, or simply an unweighted protein interaction network. Our new algorithm also achieved better performance when compared with some other well-known methods, especially in terms of the positive predictive value (PPV), which indicates the confidence of novel discovery. CONCLUSION: The higher PPV of the multiplex approach suggests that information from both sources has been effectively integrated to reduce false positives. With protein coverage higher than 90%, our algorithm is able to generate more novel biological hypotheses with higher confidence.


Subjects
Algorithms; Protein Interaction Mapping/methods; Cluster Analysis; Gene Expression Profiling; Genes, Fungal; Humans; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism
11.
BMC Bioinformatics ; 20(1): 87, 2019 Feb 19.
Article in English | MEDLINE | ID: mdl-30782113

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) play an important role in human complex diseases. Identification of lncRNA-disease associations will provide insight into disease-related lncRNAs and benefit disease diagnosis and treatment. However, using experiments to explore lncRNA-disease associations is expensive and time-consuming. RESULTS: In this study, we developed a novel method to identify potential lncRNA-disease associations by Integrating Diverse Heterogeneous Information sources with positive pointwise Mutual Information and Random Walk with restart algorithm (namely IDHI-MIRW). IDHI-MIRW first constructs multiple lncRNA similarity networks and disease similarity networks from diverse lncRNA-related and disease-related datasets, then applies the random walk with restart algorithm to these similarity networks to extract topological similarities, which are fused with positive pointwise mutual information to build a large-scale lncRNA-disease heterogeneous network. Finally, IDHI-MIRW applies the random walk with restart algorithm to the lncRNA-disease heterogeneous network to infer potential lncRNA-disease associations. CONCLUSIONS: Compared with other state-of-the-art methods, IDHI-MIRW achieves the best prediction performance. In case studies of breast cancer, stomach cancer, and colorectal cancer, 36/45 (80%) novel lncRNA-disease associations predicted by IDHI-MIRW are supported by recent literature. Furthermore, we found that lncRNA LINC01816 is associated with the survival of colorectal cancer patients. IDHI-MIRW is freely available at https://github.com/NWPU-903PR/IDHI-MIRW.
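Random walk with restart, the algorithm this abstract applies to both the similarity networks and the heterogeneous network, can be sketched generically. The code below is the textbook iteration on a small adjacency matrix, not the IDHI-MIRW implementation; the restart probability of 0.5, the convergence tolerance, the function name, and the toy graph are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj is a square adjacency matrix (list of lists); W is its
    column-normalised version, and p0 spreads mass evenly over the seeds.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Toy path graph 0 - 1 - 2 - 3, walking from node 0.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
scores = random_walk_with_restart(path, seeds=[0])
```

The steady-state scores rank nodes by proximity to the seed set, which is how such a walk can prioritise candidate associations in a network.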


Subjects
Algorithms; Computational Biology/methods; Genetic Predisposition to Disease; RNA, Long Noncoding/genetics; Colorectal Neoplasms/genetics; Genetic Association Studies; Humans; Sequence Analysis, RNA
12.
PLoS Comput Biol ; 15(1): e1006663, 2019 01.
Article in English | MEDLINE | ID: mdl-30601803

ABSTRACT

N6-methyladenosine (m6A) is the most abundant methylation, existing in >25% of human mRNAs. Exciting recent discoveries indicate the close involvement of m6A in regulating many different aspects of mRNA metabolism and diseases like cancer. However, our current knowledge about how m6A levels are controlled and whether and how regulation of m6A levels of a specific gene can play a role in cancer and other diseases remains mostly elusive. We propose in this paper a computational scheme for predicting m6A-regulated genes and m6A-associated diseases, which includes Deep-m6A, the first model for detecting condition-specific m6A sites from MeRIP-Seq data with single-base resolution using deep learning, and Hot-m6A, a new network-based pipeline that prioritizes functionally significant m6A genes and their associated diseases using the Protein-Protein Interaction (PPI) and gene-disease heterogeneous networks. We applied Deep-m6A and this pipeline to 75 MeRIP-seq human samples, which produced a compact set of 709 functionally significant m6A-regulated genes and nine functionally enriched subnetworks. The functional enrichment analysis of these genes and networks reveals that m6A targets key genes of many critical biological processes including transcription, cell organization and transport, and cell proliferation and cancer-related pathways such as the Wnt pathway. The m6A-associated disease analysis prioritized five significantly associated diseases including leukemia and renal cell carcinoma. These results demonstrate the power of our proposed computational scheme and provide new leads for understanding m6A regulatory functions and its roles in diseases.


Subjects
Adenosine/analogs & derivatives; Computational Biology/methods; Genetic Markers/genetics; Neoplasms/genetics; Software; Adenosine/genetics; Algorithms; Deep Learning; Humans; Neoplasms/metabolism; Protein Interaction Maps/genetics
13.
PLoS One ; 13(9): e0203871, 2018.
Article in English | MEDLINE | ID: mdl-30208101

ABSTRACT

Perturbing a signaling system with a series of single gene deletions and then observing corresponding expression changes in model organisms, such as yeast, is an important and widely used experimental technique for studying signaling pathways. Different computational methods have been developed to analyze the perturbation data from gene deletion experiments for exploring the signaling pathways. The most popular methods/techniques include K-means clustering and hierarchical clustering techniques, or combining the expression data with knowledge, such as protein-protein interactions (PPIs) or gene ontology (GO), to search for new pathways. However, these methods neither consider nor fully utilize the intrinsic relation between the perturbation of a pathway and expression changes of genes regulated by the pathway, which served as the main motivation for developing a new computational method in this study. In our new model, we first find gene transcriptomic modules such that genes in each module are highly likely to be regulated by a common signal. We then use the expression status of those modules as readouts of pathway perturbations to search for up-stream pathways. Systematic evaluation, such as through gene ontology enrichment analysis, has provided evidence that genes in each transcriptomic module are highly likely to be regulated by a common signal. The PPI density analysis and literature search revealed that our new perturbation modules are functionally coherent. For example, the literature search revealed that 9 genes in one of our perturbation modules are related to cell cycle and all 10 genes in another perturbation module are related to DNA damage, with much evidence from the literature coming from in vitro and/or in vivo verifications. Hence, utilizing the intrinsic relation between the perturbation of a pathway and the expression changes of genes regulated by the pathway is a useful method of searching for signaling pathways using genetic perturbation data. This model would also be suitable for analyzing drug experiment data, such as the CMap data, for finding drugs that perturb the same pathways.


Subjects
Gene Expression Profiling/methods; Gene Expression Regulation, Fungal/genetics; Cluster Analysis; Computational Biology/methods; Deep Learning; Gene Ontology; Gene Regulatory Networks; Machine Learning; Saccharomyces cerevisiae/genetics; Signal Transduction/genetics; Transcriptome
14.
Materials (Basel) ; 11(4), 2018 Apr 09.
Article in English | MEDLINE | ID: mdl-29642555

ABSTRACT

Structures/materials require simultaneous consideration of both design and manufacturing processes to dramatically enhance their manufacturability, assembly and maintainability. In this work, a novel design framework is presented for structures/materials with a desired mechanical performance and compelling topological design properties, achieved using origami techniques. The framework comprises four procedures: topological design, unfolding, reduction manufacturing, and folding. The topological design method, i.e., the solid isotropic material penalization (SIMP) method, serves to optimize the structure in order to achieve the preferred mechanical characteristics, and the origami technique is exploited to allow the structure to be rapidly and easily fabricated. The topological design and unfolding procedures can be conveniently completed on a computer; then, reduction manufacturing, i.e., cutting, is performed to remove material from the unfolded flat plate; the final structure is obtained by folding the plate from the previous procedure. A series of cantilevers, consisting of origami parallel creases and Miura-ori (usually regarded as a metamaterial) and made of paperboard, are designed with the least weight and the required stiffness by using the proposed framework. The findings here furnish an alternative design framework for engineering structures that could be better than the 3D-printing technique, especially for large structures made of thin metal materials.

15.
Med Chem ; 13(6): 515-525, 2017.
Article in English | MEDLINE | ID: mdl-28494725

ABSTRACT

BACKGROUND: RNA-protein interactions (RPIs) play an important role in many cellular processes. In particular, noncoding RNA-protein interactions (ncRPIs) are involved in various gene regulations and human complex diseases. High-throughput experiments have provided a large amount of valuable information about ncRPIs, but these experiments are expensive and time-consuming. Therefore, some computational approaches have been developed to predict ncRPIs efficiently and effectively. METHODS: In this work, we describe recent advances in predicting ncRPIs from the following aspects: i) dataset construction; ii) sequence and structural feature representation; and iii) the machine learning algorithm. RESULTS: Current methods have successfully predicted ncRPIs, but most of them were trained and tested on small benchmark datasets derived from ncRNA-protein complexes in the PDB database. The generalization performance and robustness of these existing methods need to be further improved. CONCLUSION: Given the large numbers of ncRPIs generated by high-throughput technologies, three future directions for predicting ncRPIs with machine learning deserve attention. The first is how to effectively construct the negative sample set. The second is the selection of novel and effective features from the sequences and structures of ncRNAs and proteins. The third is the design of powerful predictors.


Subjects
Computational Biology/methods; Proteins/metabolism; RNA, Untranslated/metabolism; Humans; Internet; Machine Learning; Protein Binding
16.
Mol Biosyst ; 11(3): 892-7, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25588719

ABSTRACT

Long noncoding RNAs (lncRNAs) are emerging as a novel class of noncoding RNAs and potent gene regulators, which play an important and varied role in cellular functions. lncRNAs are closely related to the occurrence and development of some diseases. High-throughput RNA-sequencing techniques combined with de novo assembly have identified a large number of novel transcripts. The discovery of large and 'hidden' transcriptomes urgently requires the development of effective computational methods that can rapidly distinguish between coding and long noncoding RNAs. In this study, we developed a powerful predictor (named lncRNA-MFDL) to identify lncRNAs by fusing multiple features of the open reading frame, k-mer, the secondary structure and the most-like coding domain sequence and using deep learning classification algorithms. Using the same human training dataset and a 10-fold cross-validation test, lncRNA-MFDL can achieve 97.1% prediction accuracy, which is 5.7, 3.7, and 3.4% higher than that of the CPC, CNCI and lncRNA-FMFSVM predictors, respectively. Compared with the CPC and CNCI predictors on testing datasets from other species (e.g., anole lizard, zebrafish, chicken, gorilla, macaque, mouse, lamprey, orangutan, xenopus and C. elegans), the new lncRNA-MFDL predictor is also much more effective and robust. These results show that lncRNA-MFDL is a powerful tool for identifying lncRNAs. The lncRNA-MFDL software package is freely available for academic users.
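Of the feature types listed in this abstract, k-mer composition is the simplest to illustrate. The sketch below computes normalised k-mer frequencies with a sliding window of step 1; this is a common construction and an assumption here, since the abstract does not describe how lncRNA-MFDL derives its k-mer features. The function name, the DNA alphabet, and the example sequence are illustrative.

```python
from itertools import product

def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Return the frequency of every possible k-mer, in lexicographic order.

    Windows containing symbols outside the alphabet (e.g. N) are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]

features = kmer_frequencies("ACGTACGT", k=2)  # 16-dimensional feature vector
```

The vector length is fixed at len(alphabet) ** k regardless of transcript length, which is what makes such features convenient inputs for a classifier.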


Subjects
Computational Biology/methods; RNA, Long Noncoding; Software; Algorithms; Humans; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/genetics; Reproducibility of Results
17.
Anal Biochem ; 449: 164-71, 2014 Mar 15.
Article in English | MEDLINE | ID: mdl-24361712

ABSTRACT

Revealing the subcellular location of newly discovered protein sequences can bring insight to their function and guide research at the cellular level. The rapidly increasing number of sequences entering the genome databanks has called for the development of automated analysis methods. Currently, most existing methods used to predict protein subcellular locations cover only one, or a very limited number of species. Therefore, it is necessary to develop reliable and effective computational approaches to further improve the performance of protein subcellular prediction and, at the same time, cover more species. The current study reports the development of a novel predictor called MSLoc-DT to predict the protein subcellular locations of human, animal, plant, bacteria, virus, fungi, and archaea by introducing a novel feature extraction approach termed Amino Acid Index Distribution (AAID) and then fusing gene ontology information, sequential evolutionary information, and sequence statistical information through four different modes of pseudo amino acid composition (PseAAC) with a decision template rule. Using the jackknife test, MSLoc-DT can achieve 86.5, 98.3, 90.3, 98.5, 95.9, 98.1, and 99.3% overall accuracy for human, animal, plant, bacteria, virus, fungi, and archaea, respectively, on seven stringent benchmark datasets. Compared with other predictors (e.g., Gpos-PLoc, Gneg-PLoc, Virus-PLoc, Plant-PLoc, Plant-mPLoc, ProLoc-Go, Hum-PLoc, GOASVM) on the gram-positive, gram-negative, virus, plant, eukaryotic, and human datasets, the new MSLoc-DT predictor is much more effective and robust. Although the MSLoc-DT predictor is designed to predict the single location of proteins, our method can be extended to multiple locations of proteins by introducing multilabel machine learning approaches, such as the support vector machine and deep learning, as substitutes for the K-nearest neighbor (KNN) method. As a user-friendly web server, MSLoc-DT is freely accessible at http://bioinfo.ibp.ac.cn/MSLOC_DT/index.html.
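The jackknife test mentioned in this abstract is leave-one-out evaluation: each sample is held out in turn and predicted from all the others. The sketch below pairs it with a 1-nearest-neighbour classifier on squared Euclidean distance. Everything in it is an illustrative assumption (function names, distance measure, toy data); it does not reproduce the MSLoc-DT features or its decision template rule.

```python
def nearest_neighbor(train_x, train_y, query):
    """Predict the label of the closest training sample (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]

def jackknife_accuracy(samples, labels, predict):
    """Hold out each sample once and predict it from the remaining ones."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Toy two-class data with made-up location labels.
samples = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["nucleus", "nucleus", "cytoplasm", "cytoplasm"]
accuracy = jackknife_accuracy(samples, labels, nearest_neighbor)
```

Because every sample is tested exactly once and the split is deterministic, the jackknife gives a single reproducible accuracy for a given dataset, at the cost of one model fit per sample.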


Subjects
Artificial Intelligence; Computational Biology/methods; Proteins/analysis; Subcellular Fractions/chemistry; Amino Acid Sequence; Animals; Databases, Protein; Gene Ontology; Humans; Molecular Sequence Data